
    Action recognition based on 2D skeletons extracted from RGB videos

    In this paper, a methodology to recognize actions from RGB videos is proposed, which takes advantage of recent breakthroughs in deep learning. Following the development of Convolutional Neural Networks (CNNs), research was conducted on the transformation of skeletal motion data into 2D images. In this work, a solution is proposed that requires only RGB videos instead of RGB-D videos; it builds on multiple works studying the conversion of RGB-D data into 2D images. From a video stream (RGB images), a two-dimensional skeleton of 18 joints is extracted for each detected body with a DNN-based human pose estimator called OpenPose. The skeleton data are encoded into the Red, Green and Blue channels of images, and different ways of encoding motion data into images were studied. We successfully use state-of-the-art deep neural networks designed for image classification to recognize actions. Based on a study of the related works, we chose the image classification models SqueezeNet, AlexNet, DenseNet, ResNet, Inception and VGG, and retrained them to perform action recognition. For all tests, the NTU RGB+D dataset is used. The highest accuracy is obtained with ResNet: 83.317% cross-subject and 88.780% cross-view, which outperforms most state-of-the-art results.
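    As a rough illustration of this pipeline, the sketch below encodes a skeleton sequence into the three channels of an image (here x to Red, y to Green, detection confidence to Blue, with joints as rows and time as columns, which is one plausible encoding rather than the paper's exact scheme) and feeds it to an ImageNet-pretrained ResNet-18 retrained for the NTU RGB+D classes. ResNet-18 and the normalization details are stand-in assumptions.

```python
# Minimal sketch, assuming an (x -> R, y -> G, confidence -> B) encoding
# with joints as rows and time as columns; the paper studies several
# encodings and its exact scheme may differ.
import numpy as np
import torch
import torch.nn as nn
from torchvision import models

N_JOINTS = 18   # OpenPose COCO-style skeleton
N_CLASSES = 60  # NTU RGB+D action classes

def skeleton_to_image(seq, size=224):
    """Encode a (T, 18, 3) array of (x, y, confidence) per joint into a
    (3, size, size) float image; rows = joints, columns = time."""
    seq = np.asarray(seq, dtype=np.float32)  # (T, 18, 3)
    img = seq.transpose(2, 1, 0)             # (3, 18, T)
    for c in range(2):                       # normalise x and y to [0, 1]
        lo, hi = img[c].min(), img[c].max()
        img[c] = (img[c] - lo) / (hi - lo + 1e-8)
    img = torch.from_numpy(img).unsqueeze(0)  # (1, 3, 18, T)
    img = nn.functional.interpolate(img, size=(size, size),
                                    mode="bilinear", align_corners=False)
    return img.squeeze(0)                     # (3, size, size)

# Retrain an ImageNet ResNet for action classification
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, N_CLASSES)

# Dummy 120-frame skeleton sequence, e.g. as produced by OpenPose
seq = np.random.rand(120, N_JOINTS, 3)
logits = model(skeleton_to_image(seq).unsqueeze(0))  # (1, N_CLASSES)
print(logits.argmax(dim=1))
```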

    Reactive Statistical Mapping: Towards the Sketching of Performative Control with Data

    This paper presents the results of our participation in the ninth eNTERFACE workshop on multimodal user interfaces. Our target for this workshop was to bring some technologies currently used in speech recognition and synthesis to a new level, i.e. to make them the core of a new HMM-based mapping system. The idea of statistical mapping was investigated, more precisely how to use Gaussian Mixture Models and Hidden Markov Models for realtime and reactive generation of new trajectories from input labels, and for realtime regression in a continuous-to-continuous use case. As a result, we have developed several proofs of concept, including an incremental speech synthesiser, software for exploring stylistic spaces for gait and facial motion in realtime, a reactive audiovisual laughter system, and a prototype demonstrating the realtime reconstruction of lower-body gait motion strictly from upper-body motion, with conservation of the stylistic properties. This project has been the opportunity to formalise HMM-based mapping, integrate several of these innovations into the Mage library, and explore the development of a realtime gesture recognition tool.
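    The continuous-to-continuous mapping case can be illustrated with Gaussian Mixture Regression (GMR): a GMM is fitted on the joint input/output feature space, and each new input yields a conditional expectation over the output. The sketch below is a minimal, offline version of that idea; the feature dimensions, component count and toy data are placeholder assumptions, and the paper's HMM-based, realtime machinery is not reproduced here.

```python
# Minimal GMR sketch: fit a GMM on concatenated [X, Y] features,
# then predict E[y | x] as a responsibility-weighted sum of the
# per-component conditional Gaussian means.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_gmr(X, Y, n_components=8):
    """Fit a GMM on the joint [X, Y] space. X: (N, dx), Y: (N, dy)."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=0)
    gmm.fit(np.hstack([X, Y]))
    return gmm

def gmr_predict(gmm, x, dx):
    """Conditional expectation E[y | x] under the joint GMM."""
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_
    resp = np.empty(len(weights))
    cond = np.empty((len(weights), means.shape[1] - dx))
    for k in range(len(weights)):
        mx, my = means[k, :dx], means[k, dx:]
        Sxx, Syx = covs[k][:dx, :dx], covs[k][dx:, :dx]
        # Responsibility of component k for this input
        resp[k] = weights[k] * multivariate_normal.pdf(x, mx, Sxx)
        # Conditional mean of y given x for component k
        cond[k] = my + Syx @ np.linalg.solve(Sxx, x - mx)
    resp /= resp.sum()
    return resp @ cond

# Toy mapping, e.g. "upper body" features to "lower body" features
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
Y = np.sin(3 * X) + 0.05 * rng.standard_normal(X.shape)
gmm = fit_gmr(X, Y)
print(gmr_predict(gmm, np.array([0.3, -0.5]), dx=2))
```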

    Human-Centered Machine Learning

    Machine learning is one of the most important and successful techniques in contemporary computer science. It involves the statistical inference of models (such as classifiers) from data. It is often conceived in a very impersonal way, with algorithms working autonomously on passively collected data. However, this viewpoint hides considerable human work of tuning the algorithms, gathering the data, and even deciding what should be modeled in the first place. Examining machine learning from a human-centered perspective includes explicitly recognising this human work, as well as reframing machine learning workflows based on situated human working practices, and exploring the co-adaptation of humans and systems. A human-centered understanding of machine learning in its human context can lead not only to more usable machine learning tools, but to new ways of framing learning computationally. This workshop will bring together researchers to discuss these issues and suggest future research questions aimed at creating a human-centered approach to machine learning.

    Adaptive training of Hidden Markov Models for stylistic walk synthesis

    Figure 1: Postures taken from four different synthesized styles (from left to right: sad, afraid, drunk and decided walks).

    Robust and automatic motion-capture data recovery using soft skeleton constraints and model averaging.

    Motion capture allows accurate recording of human motion, with applications in many fields, including entertainment, medicine, sports science and human-computer interaction. A common difficulty with this technology is the occurrence of missing data, due to occlusions or recording conditions. Various models have been proposed to estimate missing data. Some are based on interpolation, low-rank properties or inter-correlations; others involve dataset matching or skeleton constraints. While the latter have the advantage of promoting a realistic motion estimation, they require prior knowledge of skeleton constraints or the availability of a prerecorded dataset. In this article, we propose a probabilistic averaging method over several recovery models (referred to as Probabilistic Model Averaging, PMA), based on the likelihoods of the distances between body points. This method has the advantage of being automatic while allowing efficient gap recovery. To support and validate the proposed method, we use a set of four individual recovery models based on linear/nonlinear regression in local coordinate systems. Finally, we propose two heuristic algorithms to enforce skeleton constraints in the reconstructed motion, which can be used with any individual recovery model. For validation purposes, random gaps were introduced into motion-capture sequences, and the effects of factors such as the number of simultaneous gaps, gap length and sequence duration were analyzed. Results show that the proposed probabilistic averaging method yields better recovery than (i) each of the four individual models and (ii) two recent state-of-the-art models, regardless of gap length, sequence duration and number of simultaneous gaps. Moreover, both of our heuristic skeleton-constraint algorithms significantly improve the recovery for 7 out of 8 tested motion-capture sequences (p < 0.05), for 10 simultaneous gaps of 5 seconds. The code is available for free download at: https://github.com/numediart/MocapRecovery
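    The averaging idea can be sketched as follows: each candidate recovery model proposes a position for the missing point, and the proposals are blended with weights proportional to how likely the resulting inter-point distances are under Gaussian statistics estimated from fully observed frames. This is a loose, simplified reading of PMA; the four regression models, the local coordinate systems and the exact likelihood used in the paper are not reproduced, and all names and data below are illustrative.

```python
# Simplified PMA-style sketch: weight each model's proposed 3-D position
# for a missing joint by the Gaussian likelihood of its distances to
# observed reference joints, with distance statistics learned from
# fully observed frames. A loose reading of the paper's method.
import numpy as np
from scipy.stats import norm

def distance_stats(frames, joint, refs):
    """Mean/std of distances from `joint` to each reference joint,
    estimated over fully observed frames. frames: (T, J, 3)."""
    d = np.linalg.norm(frames[:, refs] - frames[:, [joint]], axis=-1)
    return d.mean(axis=0), d.std(axis=0) + 1e-6

def pma_recover(proposals, frame, refs, mu, sigma):
    """Blend the models' proposed positions, weighting each by the
    likelihood of its distances to the reference joints."""
    logw = np.zeros(len(proposals))
    for i, p in enumerate(proposals):
        d = np.linalg.norm(frame[refs] - p, axis=-1)
        logw[i] = norm.logpdf(d, mu, sigma).sum()
    w = np.exp(logw - logw.max())  # stabilised softmax over models
    w /= w.sum()
    return w @ np.asarray(proposals)

# Toy example: 3 models propose positions for missing joint 0,
# with joints 1 and 2 observed in the current frame.
rng = np.random.default_rng(1)
frames = rng.standard_normal((200, 3, 3))  # fully observed training frames
mu, sigma = distance_stats(frames, joint=0, refs=[1, 2])
frame = frames[0].copy()
proposals = [frame[0] + 0.1 * rng.standard_normal(3) for _ in range(3)]
print(pma_recover(proposals, frame, [1, 2], mu, sigma))
```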